FAIR Science: A Model for Enhanced Data Management
The Significance
The FAIR guiding principles for scientific data management and stewardship call for data to be Findable, Accessible, Interoperable, and Reusable. These principles underpin a scientific ecosystem that prioritizes transparency, data reuse, and collaboration, which in turn speeds scientific discovery. In addition, since 2023 the NIH Data Management and Sharing Policy has mandated data sharing for NIH-supported studies. The Gray Foundation DCC helps consortium teams meet, and where possible exceed, NIH data sharing requirements.
Benefits to Consortium Teams
This commitment to FAIR science benefits consortium teams in several ways:
- Enhanced Collaboration in the Present: Consortium teams experience increased collaborative productivity as they gain the ability to effortlessly locate and utilize shared data, fostering synergy in current research endeavors.
- Empowered Future Grant Proposals: The practice of data sharing positions consortium teams favorably when applying for grants, particularly those focused on Team Science. A history of effective data sharing demonstrates a commitment to the broader scientific community, aligning with evolving expectations in the field.
  In 2010, a group of scientists advocated for the consideration of all products stemming from research grants, extending beyond traditional peer-reviewed publications. This encompasses the sharing of raw data and self-published results through digital platforms and social media. Notably, this movement has already influenced policy changes, with organizations like the NSF expanding their requirements to include products such as datasets, software, patents, and copyrights (Funding and Evaluation of Team Science).
- Heightened Impact and Citations: By making data, protocols, and related resources accessible, consortium teams open doors for other researchers to employ their work, leading to increased recognition and citations from the broader scientific community.
FAIR science requires collecting comprehensive metadata that describes data and resources and makes them easier to find and reuse. A well-defined data model determines which metadata are collected and with what priority, guiding consortium teams toward sound data management and sharing practices.
Development Process
The DCC develops the model using existing standards while also consulting the contributing teams for specialized data. When a change to the model affects the wider consortium, development goes through a broader Request for Comments (RFC) process. For example, please see the first public RFC document (archived/closed), which helped determine the current core model. Continued development, if necessary, can involve additional RFCs.
Data Model Source
The data model source is versioned here.
Capturing Metadata Through Templates
Data are captured through spreadsheet templates, available as either Google Sheets or Excel (in cases where contributors can't access Google products due to institutional policies).
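To illustrate how a data model can drive template-based metadata capture, the following is a minimal sketch of checking a CSV export of a filled-in template against a set of required attributes. The column names (specimen_id, assay, file_format, comments), the TEMPLATE dictionary, and the validate_rows function are all hypothetical and are not taken from the DCC's actual data model or tooling.

```python
import csv
import io

# Hypothetical template definition: column name -> whether it is required.
# These attribute names are illustrative only, not the real data model.
TEMPLATE = {
    "specimen_id": True,
    "assay": True,
    "file_format": True,
    "comments": False,
}


def validate_rows(csv_text, template=TEMPLATE):
    """Return a list of (row_number, message) problems in a CSV export."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # Row 1 is the header: every required column must be present.
    for column, required in template.items():
        if required and column not in header:
            problems.append((1, f"missing column: {column}"))

    # Data rows start at spreadsheet row 2.
    for row_number, row in enumerate(reader, start=2):
        for column, required in template.items():
            if not required or column not in header:
                continue
            value = (row.get(column) or "").strip()
            if not value:
                problems.append((row_number, f"empty required field: {column}"))
    return problems


example = (
    "specimen_id,assay,file_format,comments\n"
    "S1,RNA-seq,fastq,\n"
    "S2,,bam,needs review\n"
)
print(validate_rows(example))
```

In this sketch the optional comments column may be left blank, while the empty assay value in the second data row is reported. A real template would take its required and optional attributes from the versioned data model rather than from a hard-coded dictionary.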